Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows
Abstract
MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but thi...
Similar resources
Liquid-liquid equilibrium data prediction using large margin nearest neighbor
Guanidine hydrochloride has been widely used in the initial recovery steps of active protein from the inclusion bodies in aqueous two-phase system (ATPS). The knowledge of the guanidine hydrochloride effects on the liquid-liquid equilibrium (LLE) phase diagram behavior is still inadequate and no comprehensive theory exists for the prediction of the experimental trends. Therefore the effect the ...
NNH: Improving Performance of Nearest-Neighbor Searches Using Histograms
Efficient search for nearest neighbors (NN) is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper we propose a novel technique, called NNH (“Nearest Neighbor Histograms”), which uses specific histogram structures to improve the performance of NN search algorithms. A primary feature of our proposal is that such histogram structures can co-e...
Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph
We introduce a new nearest neighbor search algorithm. The algorithm builds a nearest neighbor graph in an offline phase and when queried with a new point, performs hill-climbing starting from a randomly sampled node of the graph. We provide theoretical guarantees for the accuracy and the computational complexity and empirically show the effectiveness of this algorithm.
Fast Reduction of Large Dataset for Nearest Neighbor Classifier
Accurate and fast classification of large data obtained from medical images is very important. Proper image (data) processing makes it possible to construct a classifier that supports the work of doctors and can solve many medical problems. Unfortunately, Nearest Neighbor classifiers become inefficient and slow for large datasets. Dataset reduction is one of the most popular solutions to this problem...
Journal
Journal title: Bioinformatics
Year: 2007
ISSN: 1460-2059,1367-4803
DOI: 10.1093/bioinformatics/btm220